Pediatric lymphoid leukemia has the highest cure rate of all pediatric malignancies, yet due
to its prevalence, still accounts for the majority of childhood cancer deaths and requires
long-term highly toxic therapy. The ability to target B-cell ALL with immunoglobulin-like
binders, whether anti-CD22 antibody or anti-CD19 CAR-Ts, has impacted treatment options
for some patients. The development of new ways to target B-cell antigens continues at a rapid
pace. T-cell ALL accounts for up to 20% of childhood leukemia but has yet to see a set of
high-value immunotherapeutic targets identified. To find new targets for T-ALL immunotherapy,
we employed a bioinformatic comparison to broad normal tissue arrays, hematopoietic
stem cells (HSC), and mature lymphocytes, then filtered the results for transcripts encoding
plasma membrane proteins. T-ALL bears a core T-cell signature, and transcripts encoding
TCR/CD3 components and canonical markers of T-cell development predominate, especially
when comparison was made to normal tissue or HSC. However, when comparison
to mature lymphocytes was also undertaken, we identified two antigens that may drive, or
be associated with, leukemogenesis: TALLA-1 and hedgehog interacting protein. In addition,
TCR subfamilies, CD1, activation and adhesion markers, membrane-organizing molecules,
and receptors linked to metabolism and inflammation were also identified. Of these, only
CD52, CD37, and CD98 are currently being targeted clinically. This work provides a set of
targets to be considered for future development of immunotherapies for T-ALL.
INTRODUCTION

The origin of leukemia may be best understood through the specific genomic mutations
that alter the regulation, growth, and differentiation of the lymphocytic cells that
comprise the disease. These mutations both define the biology of the disease at
presentation and can also reveal the genetic origins of the leukemic stem cell. The
promise of next-generation sequencing technologies to more fully describe these mutations
is starting to be fulfilled as exomic or genomic sequencing has begun to supplement, and
may someday replace, traditional methods of leukemia diagnosis and classification based on
pathology, immunophenotyping, and molecular cytogenetics (including FISH, fluorescence-based
in situ hybridization, and PCR, polymerase chain reaction, for known genetic lesions) (1).

Genomic technology, however, cannot stand on its own, as verification of target expression
is still required at the protein level. Thus, it is immunophenotyping that ultimately
informs the field of immunotherapeutics whether or not a genetic target could serve as a
therapeutic target for either antibody or T-cells transduced to express chimeric antigen
receptors (CAR-Ts). The advent of CD19-CAR-T-cell therapy has impacted the treatment of
pre-B-cell ALL for some patients with advanced disease. Indeed, we and others have proposed
a number of targets that may be suitable for pediatric B-ALL (2, 3). However, attractive
targets for T-cell leukemia have yet to be recognized and exploited. We present here
potential targets for treating T-cell ALL with antibody or CAR-Ts, using strategies
developed for the analysis of pediatric solid tumors and B-ALL (4).

In 1993, Pui et al. reviewed ontogeny marker expression in T-ALL in light of normal T-cell
antigen expression during thymic development (5). T-ALL was considered as either
prothymocyte- (expressing CD7), early thymocyte- (expressing CD5, CD2, and CD1),
intermediate thymocyte- (CD1, CD4, or CD8), or mature thymocyte-like (CD3 and TCR surface
expressed). The CD1 antigen, expressed on cortical thymocytes, Langerhans cells, and a
subset of B-cells, is the only one of these developmental antigens to be turned off upon
reaching T-cell maturity. Reinherz originally proposed that T-ALL be classified along the
lines of CD1 and CD3 expression with stage 1 (early thymocyte) expressing CD2, CD5, CD7,
and no CD1, CD4, CD8, or CD3; stage 2 (intermediate) expressing CD1, CD2, CD5, and CD7
with variable CD4 and CD8, and weak CD3; and stage 3 (mature) expressing CD2, CD3, CD5,
CD7, and CD4 or CD8 (usually only one or the other) (6, 7). In the simplest terms, mature
or medullary T-ALL expresses surface CD3, but not CD1a. Cortical or thymic T-ALL expresses
CD1a, but not surface CD3; and early T-cell precursor T-ALL (ETP-ALL, which encompasses
Pro-T-ALL and Pre-T-ALL) does not express CD3 or CD1a.

The answer to the challenge of finding T-cell restricted targets (that is, a mature T-cell
antigen present on ALL that can be safely eliminated, as CD19 is for B-cells), or a more
T-ALL restricted target (especially for the more immature forms of the disease), may lie
in the nature of the progenitor cell itself. As elegantly presented by the St. Jude -
Washington University Pediatric Cancer Genome Project, early precursor T-cell ALL shares
many similarities with double negative thymocytes that have the potential to differentiate
into cells of either T-cell or myeloid lineage (8, 9). This pluripotency makes the antigenic
expression profile for T-ALL far more generalized. At the other end of the spectrum, the
most mature forms of T-ALL may benefit from new immunotherapeutic approaches that target
the T-cell receptor, specifically, different subclasses that have clonally expanded.
Although this was once deemed an approach of little interest due to the low number of
cases and the need for an almost individualized treatment approach, the success of
CAR-T-cell therapy, which is the essence of personalized or individualized medicine, has
brought this approach to the fore once more.

DATA INTERROGATION AND RESULTS 

T-cell ALL (acute lymphocytic/lymphoblastic leukemia) accounts for 15-18% of all childhood
leukemias and 25% of ALL in adults (5, 10). However, we have yet to see a set of high-value
targets proposed for T-cell ALL as we have for B- or pre-B-ALL. To that end we undertook a
bioinformatics approach to describe the cell surface proteins expressed on T-ALL. Although
we have begun to accumulate and analyze T-ALL cases at the Pediatric Oncology Branch of
the NCI, a large data set was recently made available at Gene Expression Omnibus (GEO)
using the Affymetrix Human Genome U133 Plus 2.0 platform. In this submission, GSE47051,
over 100 cases of pediatric T-ALL and pre-B-ALL are available for analysis. The samples
were of high quality and derived from bone marrow or peripheral blood aspirates of
pediatric ALL patients enrolled in two co-operative group trials (NOPHO, Nordic Society
for Pediatric Hematology-Oncology), fully described in the original publications for this
sample set (11, 12). Samples were obtained under informed consent, approved by the Regional
Ethical Board in Uppsala, Sweden, and in accordance with the guidelines set forth in the
Declaration of Helsinki.

The CEL files for these annotated samples were accessed through the GEO home page and
analyzed. To test the accuracy of the pathological categorization of this sample set, we
carried out hierarchical clustering using the freely available ArrayMining tool (Online
Microarray Data Mining Tool) (13). Figure 1 demonstrates the partitioning of the pre-B-ALL
and T-ALL samples, confirming their original pathological classification. Moreover, the
genes driving this classification are unmistakably well-recognized components of ALL
(Table 1). The T-ALL set contains CD3D, CD7, and CD28, while in the pre-B-ALL set we find
CD19, CD79B, and CD22. In Table 1 we present the top 10 cell surface genes with regard to
Pearson correlation (gene vs. outcome), and in Figure 2, we present the top four overall
scores, as expressed by their respective sample sets. Not surprisingly, the gene expression
values that drove this classification were highly divergent from each other, such as CD3D
for T-ALL and CD19 and EBF1 (early B-cell factor 1) for pre-B-ALL. Having confidence in the
submitted sample classification, we then analyzed the T-ALL and pre-B-ALL samples for
differences in average gene expression in comparison to broad normal tissue arrays. For
each cell surface membrane transcript, an average expression level was calculated, and then
compared to the average expression of the transcript in normal tissue, thus defining a
potential space for immunotherapeutic intervention. Methods for this analysis were
published previously (4). To confirm our analysis, we also analyzed normal and disease gene
expression files using the Partek® Genomics Suite™ v6.5. The GEO database was used to
download individual CEL files into the software suite, and the data were assessed for
quality and analyzed for differential gene expression using the suite's statistical
packages (14-16).
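
The comparison of average expression in leukemia against average expression in normal tissue can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the published pipeline (which used the methods of Ref. 4 and Partek Genomics Suite): the function names, the gene labels, the toy values, and the choice of Welch's t statistic are all hypothetical, and values are assumed to be on a linear scale so that a ratio of means is meaningful.

```python
from math import sqrt
from statistics import mean, variance


def fold_change(disease, normal):
    """Ratio of average expression in disease samples over normal tissue."""
    return mean(disease) / mean(normal)


def welch_t(disease, normal):
    """Welch's t statistic for two samples with unequal variances."""
    se = sqrt(variance(disease) / len(disease) + variance(normal) / len(normal))
    return (mean(disease) - mean(normal)) / se


# Hypothetical linear-scale expression values for two transcripts (not real data).
expression = {
    "GENE_A": {"disease": [900.0, 1100.0, 1000.0], "normal": [90.0, 110.0, 100.0]},
    "GENE_B": {"disease": [210.0, 190.0, 200.0], "normal": [180.0, 220.0, 200.0]},
}

# Rank transcripts by the disease-to-normal ratio, highest first.
ranked = sorted(
    expression,
    key=lambda g: fold_change(expression[g]["disease"], expression[g]["normal"]),
    reverse=True,
)
for gene in ranked:
    d, n = expression[gene]["disease"], expression[gene]["normal"]
    print(gene, round(fold_change(d, n), 2), round(welch_t(d, n), 2))
```

In this toy example, a transcript with a large ratio and a large t statistic would be a candidate, while one expressed equally in disease and normal tissue would not.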

Table 2 lists the top 40 genes for each disease category, as a function of the ratio of an
individual transcript's expression level in ALL vs. a set of 117 normal tissues, our broad
normal tissue array. A third of the transcripts (15 of the 46 transcripts, 33%) were present
in both pre-B-ALL and T-ALL. These transcripts are by definition shared more between the
two leukemias than with normal tissue. They are (bold italics in the table): PTGER4, ITGA4,
CD37, CD52, CD62L (L-selectin), CXCR4, CD69, EVI2B (CD361), SLC39A8, MICB, LRRC70, CLEC2B,
HMHA1, LST1, and CMTM6 (CKLFSF6).
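
Finding the transcripts common to the two top-hit lists is a simple set intersection. The sketch below assumes abbreviated, hypothetical lists (the full Table 2 lists are longer) and a helper name, `shared_fraction`, that is not part of the original analysis.

```python
def shared_fraction(list_a, list_b):
    """Return the symbols of list_a also found in list_b, and their fraction of list_a."""
    in_b = set(list_b)
    shared = [gene for gene in list_a if gene in in_b]
    return shared, len(shared) / len(list_a)


# Abbreviated, illustrative top-hit lists; not the complete Table 2 content.
t_all_top = ["CD3D", "CD7", "CD52", "CD37", "CXCR4", "HHIP"]
pre_b_all_top = ["CD19", "CD22", "CD52", "CD37", "CXCR4", "CD79B"]

shared, fraction = shared_fraction(t_all_top, pre_b_all_top)
print(shared, f"{fraction:.0%}")
```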

SHARED PRE-B-ALL AND T-ALL GENES 

SLC39A8 is the easiest to classify as this zinc transporter is a solute carrier protein
expressed during inflammation. In short, it is a response to metabolic demand (17). The next
group of markers that naturally fall together has to do with core homing and adhesion
functions of lymphocytes, specifically CD37, a tetraspanin that associates with integrins;
CD49d, an integrin expressed on activated lymphocytes; CD62L, a homing receptor; CD52, an
adhesion receptor; and CXCR4, whose natural ligand is SDF-1 and which mediates homing to the
bone marrow. Tetraspanins organize plasma membrane domains in order to co-ordinate signaling
and cellular functions. In B-cells, the CD81/CD21/CD19 co-receptor complex interacts with
CD82 and lipid rafts, while in T-cells the CD151/CD81/CD37/Tssc-6 complex is thought to
perform a similar structural-signaling role (18, 19). This linkage to the tetraspanin CD37
is significant, as our data now link the expression of CD37 with known adhesion receptors
of the integrin family in ALL. Anti-CD37 antibody is currently being evaluated in adult
B-cell malignancies (NHL, CLL, MM, ALL) as a drug conjugate with the maytansine derivative
DM1 (20).

CXCR4 is over-expressed in more than 20 different cancers, and expression in normal tissues
(other than in the CD34 stem cell niche in bone marrow) is measurably lower (21-23). CD49d
(alpha 4, beta 1/beta 7 integrin) is an adhesion receptor that mediates both rolling and
firm adhesion to endothelium (24). CD49D expression in tandem with CXCR4 has been studied
in pediatric and adult ALL. In children with ALL, no difference in expression levels was
correlated to outcome, while adults with high CXCR4 and low CD49D (VLA-4) did worse (25).

CD62L is the canonical homing receptor for lymphocytes to enter secondary lymphoid tissues
via high endothelial venules. Recently, targeting CD62L has been proposed for CLL, and thus
it may be a viable target in pediatric ALL (26). CD52 is the well-characterized target of
the anti-CAMPATH-1 antibody alemtuzumab, and is expressed on activated T-cells. This
represents a second target that could be more fully explored in pediatric ALL. CLEC2B
(C-type lectin domain family 2B) is encoded in the NK cell killer gene complex on
chromosome 12, and is also known as activation-induced C-type lectin of T-cells (AICL) (27).
AICL/CLEC2B maps adjacent to CD69, whose C-type lectin domain is similar in sequence. CD69,
a well-described activation marker for T-cells, has also been found to be a valid marker for
prognosis in CLL, reflecting its ability to be up-regulated in either cell lineage (28).
MICB (MHC class I polypeptide-related sequence B) is a ligand for NKG2D, and it thus
activates T or NK cells. The shedding of the MICB protein is thought to help tumors evade
NK-mediated immunosurveillance (29). Although not expressed on pre-B-ALL, CD96 is also an
immunoglobulin superfamily member, and we discuss it here because it is also a ligand for
activated T or NK cells. CD96 has also been described as a leukemia stem cell marker in AML
(30). Since the expression of CD96 is low or absent on normal hematopoietic stem cells
(HSC), Staudinger et al. have proposed using anti-CD96 antibody to purge bone marrow during
hematopoietic stem cell transplantation (HSCT) for the disease (31).

FIGURE 1 | Hierarchical clustering of T-ALL and pre-B-ALL.
Samples from GEO data set GSE47051 were downloaded and then clustered without reference to
their diagnostic category (T-ALL or B-ALL). The top genes driving the subsequent clustering
are listed in the X-axis, and the disease categories of those samples on the Y-axis. Gene
expression (log2) was normalized by z score ((x - mean)/std. dev.) across all leukemia
samples for purposes of comparison.
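
The z-score normalization mentioned in the Figure 1 legend, (x - mean)/std. dev. across all leukemia samples, can be illustrated as follows. This is a sketch only: the expression values are invented, and the use of the population standard deviation is an assumption, since the legend does not say which estimator was used.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize one gene's expression across samples: (x - mean) / std. dev."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        # A gene with identical values in every sample carries no contrast.
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]


# Hypothetical log2 expression of a single gene across five leukemia samples.
log2_expression = [6.0, 8.0, 10.0, 12.0, 14.0]
normalized = z_scores(log2_expression)
print([round(z, 3) for z in normalized])
```

After this transformation each gene has mean 0 and unit standard deviation across samples, so genes with very different absolute expression levels can share one color scale in a heat map.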
Table 1 | Top 10 cell surface genes driving classification.
Genes were clustered, and then ordered according to Pearson correlation co-efficient (PC, gene vs. outcome). The genes scoring as positive (T-ALL markers) or
negative (B-ALL markers) confirm the original designation in GEO.

FIGURE 2 | Gene expression values, across all samples, for top scoring hits.
The average and range of values of the top four genes are presented in this box and whisker
plot, demonstrating that CD3D and CD7 for T-ALL and CD19 and EBF1 for BCP-ALL (B-ALL) can be
used to readily distinguish and verify the original pathological classification of these
disease samples.
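
The Pearson ordering used for Table 1 (gene vs. outcome) can be sketched as below. The numeric coding of the outcome (+1 for T-ALL, -1 for pre-B-ALL), the gene labels, and the expression values are assumptions made for illustration; the coding was chosen so that T-ALL markers score positive and B-ALL markers negative, as in the Table 1 legend.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Outcome coded as +1 for T-ALL and -1 for pre-B-ALL (an assumed encoding).
outcome = [1, 1, 1, -1, -1, -1]

# Hypothetical log2 expression for two genes across the same six samples.
genes = {
    "T_MARKER": [11.0, 12.0, 11.5, 4.0, 5.0, 4.5],
    "B_MARKER": [3.0, 3.5, 4.0, 10.0, 11.0, 10.5],
}

scores = {gene: pearson(values, outcome) for gene, values in genes.items()}
ordered = sorted(scores, key=scores.get, reverse=True)
for gene in ordered:
    print(gene, round(scores[gene], 3))
```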



Table 2 | T-ALL and B-ALL cell surface immunotherapy targets.
The left-hand side of the table lists the top scoring hits for T-ALL, and the right side of the table shows the top hits for pre-B-ALL, expressed as the ratio of expression in
the leukemia over the expression of that transcript in normal tissue (117 samples covering all major tissue types of the body). These values were calculated using the
ANOVA module in Partek Genomics Suite (Partek, Inc., St. Louis, MO, USA), where T tests, ratio of average expression, and the associated p values were calculated.
Rank indicates the numerical order of the list, with the TCR subunits for T-ALL and the MHC class II genes for B-ALL marked with a letter so as to indicate the presence
of very similar hits. The affy ID generating the highest scoring hit is reported, and hits for identical genes were not repeated. The comment column is used to report
alternate nomenclature. Boxes are used to indicate genes we have previously described as over-expressed in B-ALL (4). Gene symbols in bold italic are present in
both T-ALL and B-ALL lists. The highest p value for all the analyses presented in this table was 2.2 × 10^-10, and thus p values are not presented in the table.

CD361 (EVI2B) is a recently described antigen that was classified at the 10th human
leukocyte differentiation antigens (HLDA) conference as part of the B-cell panel, showing
specific staining of ALL (32). This gene is found in an intron of the NF1
(neurofibromatosis type 1) gene, along with EVI2A and OMgp, being transcribed in the
opposite direction from NF1 (33). Little is known of its function, and it may be
over-expressed simply because of its genomic location. CMTM6 (CKLF-like MARVEL
transmembrane containing 6, CKLFSF6) is a chemokine-like gene that also contains a domain
that bears sequence similarity to tetraspanins. Thus, it appears to play both a signaling
and a membrane-organizing role. The biological function is still being investigated.

LRRC70 (leucine-rich repeat containing 70/synleurin) is also not well-described and may
play a role in cell adhesion or may bind a growth factor (34). Few functional studies have
been carried out on the LRR protein family. PTGER4 is a prostaglandin receptor. This
G-protein-coupled receptor is known to activate T-cells. Lymphocytic colitis was recently
shown to be associated with increased TNFA, IFNG, and PTGER4, and a role for prostaglandin
proposed in activating pathogenic lymphocytes (35). LST1 (leukocyte specific transcript 1)
is expressed normally in cells of myeloid lineage, serving as a transmembrane adaptor
protein, interacting with the SHP-1 and SHP-2 phosphatases (36). There is no described
association between LST1 and ALL to this point.

Mutis et al. generated cytotoxic T-cells restricted to the non-self HLA molecule
HA-1/HMHA1 (histocompatibility, minor, HA-1), and proposed using this reactivity to
actively target residual disease during HSCT (37, 38). The restriction of this antigen to
the hematopoietic system highlights the fact that HSCT, which is often part of ALL therapy,
could be augmented by targeting leukemia-expressed HA-1. In clinical trials, HA-1 and HA-2
peptide vaccines are being tested for the induction of increased graft vs. leukemia effect
following allogeneic HSCT (NCT00943293). In reviewing this list of genes over-expressed in
both T-ALL and pre-B-ALL, in comparison to normal tissue, a general picture of immune
activation arises. While some of the proteins encoded by the transcripts expressed by both
T-ALL and B-ALL are not as well-characterized, they still fit this theme and are also likely
to be expressed throughout the immune system. Obvious therapeutic correlation can be made
for CD52 and CD37. Antibody to CD96 is used for purging marrow, because it is too broadly
expressed in normal tissue to be targeted by an active agent. The ability of MICB and CD96
to activate T-cells or NK cells may also be an important insight, as adoptive immunotherapy
with T-cells engineered to express NK ligands has been demonstrated and their presence may
therefore be associated with an activated mature phenotype (39). Also of importance to
immunotherapy is the presence of minor antigen (HA-1) in ALL that may be exploited during
HSCT. We now turn our attention to the transcripts expressed in the disease most in need of
new approaches, pediatric T-ALL.

T-ALL GENES 

Among the T-ALL transcripts listed as being over-expressed in Table 2, 7 of 46 are TCR
chains (TCRBV5-4, TCRBC1, TCBC1/TCRBV, TCRDC, TCRG2, TCR delta, and TCRGC2) and 4 are
members of the CD3 complex (CD3-delta, -epsilon, -gamma, and -zeta), making 11 of 46 (23%)
of the T-ALL hits in Table 2 directly related to clonotypic T-cell marker expression. This
analysis does not inform us directly if these molecules are on the surface, as the presence
of the surface protein CD3-epsilon is the key diagnostic criterion for classification of
T-ALL as being mature/medullary type, but many of these are re-arranged TCR chains,
indicating the T-ALL data set we are analyzing is weighted toward a more mature phenotype.
The potential of these re-arranged proteins as immunotherapeutic targets will be discussed
later.

Transcripts for many canonical normal T-cell markers are also seen in T-ALL, such as CD2,
CD7, CD11a, CD28, CD45R, CD45AP, CD84, CD99, IL2RG, IL-7R, and the CD1 family markers CD1E
and CD1B. If we consider this set of transcripts as an aggregate diagnostic of a single
case, the presence of CD2 and CD1 transcripts, along with CD7, would classify this sample
as a cortical or thymic T-ALL, which is in keeping with the predominance of re-arranged TCR
transcripts detected (as opposed to an early T-cell precursor type ALL). CD84 is in the
SLAM (signaling lymphocyte activation molecule) subset of the CD2 family of proteins, and
mediates T-cell activation through homotypic adhesion (40). CD28 is the well-characterized
second signaling molecule, CD11A (LFA-1 alpha chain) is a well-recognized lymphocyte
integrin, and CD45 is LCA (leukocyte common antigen). CD45AP is known to positively
regulate CD45 function. The transcripts for IL2 and IL7 receptor chains are also in keeping
with the T-cell transcriptome of T-ALL. CD99 is a well-characterized marker of T-ALL and in
pediatric disease its high level of expression has been proposed as a marker for detecting
minimal residual disease along with CD3, CD7, and/or TdT (41). Hence, this set of
transcripts is largely a picture of a normal activated immune cell.

The expression of CD1A is a key diagnostic for T-ALL, representing the key transition,
along with high expression of CD2, to a more mature phenotype. In our analysis, we detected
transcripts for CD1B and CD1E. CD1A is thought to be the primary cell surface CD1 family
member while CD1B interacts with intracellular lipids and CD1E functions as a lipid
chaperone, reviewed in Ref. (42). Nevertheless, CD1B clearly presents mycobacterial
antigens to T-cells and is on the surface (43). In studying LEF1 mutations in pediatric
T-ALL, Gutierrez et al. demonstrated that this subset of pediatric T-ALL was arrested at
the cortical stage and they demonstrated the presence of CD1B and CD1E by gene expression
profiling (44). We can infer from this that the CD1 family is induced as a group, and also
hope that unique subsets of lipid antigens may be present on T-ALL. None have yet been
reported.

Three immune cell adhesion receptors are over-expressed in T-ALL. ICAM-3 (CD50) is
constitutively expressed on most B-cell malignancies, except for Hodgkin lymphoma, and
myeloid cells, but appears to be absent from other tissues, with the possible exception of
tumor-associated vasculature (45, 46). We report here that it is likely to be found on
pediatric T-ALL as well. HMMR (hyaluronan-mediated motility receptor, RHAMM, or CD168) is
found in a complex containing BRCA1 in breast tissue where it may govern apicobasal
polarity (47). TCRs recognizing RHAMM were generated in a model system and found to inhibit
AML growth in a xenogeneic system (48). However, these T-cells also recognized CD34+ HSC in
an HLA-A2 restricted manner. Thus, an active T-cell population targeting HMMR could only be
envisioned for clinical use in an HLA mismatch setting.

LPAR6, lysophosphatidic acid receptor 6, is another interesting transcript that may be
expressed either for its contribution to cellular transformation or because it is part of
an active genomic locus, as it is transcribed from an intron of the RB gene (49). On its
own, this G-protein-coupled receptor is known to stimulate cell activation upon encountering
its ligand, lysophosphatidic acid, and may play a role in tumor cell tissue invasion (50).
Like the zinc transporter, another T-cell metabolic protein, ORAI2 (ORAI calcium
release-activated calcium modulator 2), is over-expressed in T-ALL. It is likely to function
in a manner similar to ORAI1, where STIM1 activates the translocation of this calcium
channel to the surface of the T-cell upon depletion of intracellular calcium stores (51).

Three transcripts expressed by T-ALL are a little more enigmatic, but reflect the fact that
we are dealing with a transformed cell population. HHIP (Hedgehog interacting protein)
interacts with all forms of sonic hedgehog (SHH), a secreted factor that guides tissue
formation during development, and is a key mediator of its function (52). HHIP function on
T-ALL has not been explored, but an AML study has described the ability of AML-derived
stromal cells to express HHIP and to interact with AML stem cells that now re-express SHH
and its receptor, smoothened (SMO) (53). TSHR (thyroid stimulating hormone receptor) is not
normally considered a T-cell antigen, but mutations of this receptor have been associated
with increased vasculature and the development of pituitary adenomas (54, 55). Clinical
studies with TSHR primarily focus on Graves' disease. P2RY8 (purinergic receptor P2Y,
G-protein-coupled 8) is best known as supplying the promoter that drives CRLF2/TSLPR
overexpression, upon deletion of an intervening pseudoautosomal region, in certain cases of
B-ALL (56, 57). In a fascinating study, Fujiwara et al. screened for transforming genes in
the rare biphenotypic acute leukemia (BAL), which bears both lymphoid and myeloid markers
(58). They found P2RY8 expression to activate a number of cellular activation pathways,
including those mediated by CREB and Elk-1. Thus, P2RY8 may be over-expressed simply by
residing in a very active genomic region or it may play a functional role in transforming
leukemia cell clones.

In order to further analyze T-ALL gene expression signatures, we sought to determine if the
normal gene sets we used for comparison had an impact on the type of transcripts identified
as being over-expressed. We ran two subsequent analyses using CD34+ bone marrow-derived
HSC and normal peripheral blood mononuclear cells (PBMC) as the normal tissue comparators
(Table 3). HSC CEL files were submitted to GEO as GSE32719 by Debashis Sahoo of the
Weissman Lab at Stanford University, USA (59). We selected CD34+ HSC from normal donors
aged 19-31 years old (n = 14). For normal PBMC, we used control patients from the GEO
Submission GSE21942 by Kemppinen and Saarela at the University of Cambridge, UK (n = 15)
(60). In the top half of Table 3, where comparison is made to HSC, a profile similar to the
normal tissue array of Table 2 is seen, where T-cell transcripts predominate. In fact, 60
and 70% of the transcripts (ranked by T score and fold-change, respectively) were shared.
However, when comparison was made to PBMC, the percentage of shared transcripts dropped to
10 and 15% for T score and fold-change, respectively. Only three transcripts were shared
between the lists featured in the top and bottom halves of Table 3 (HSC vs. PBMC): CD3D (the
delta chain of the TCR complex), HHIP, and TALLA-1 (TSPAN7). The presence of
T-cell-associated transcripts has been discussed, but HHIP and TALLA-1 warrant special
attention.
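
The effect of the normal comparator on which transcripts rank highest can be illustrated with a toy example. Everything here is hypothetical: the gene labels, the expression values, and the `top_hits` helper are assumptions chosen to show the principle that a lineage transcript stands out against HSC but not against mature lymphocytes, whereas a leukemia-associated transcript survives both comparisons.

```python
from statistics import mean


def top_hits(disease, comparator, n):
    """Rank genes by the ratio of mean expression (disease / comparator); keep the top n."""
    ratios = {gene: mean(disease[gene]) / mean(comparator[gene]) for gene in disease}
    return sorted(ratios, key=ratios.get, reverse=True)[:n]


# Hypothetical linear-scale expression values (not taken from the data sets above).
t_all = {
    "LINEAGE_GENE": [800.0, 1200.0],
    "LEUKEMIA_GENE": [400.0, 600.0],
    "HOUSEKEEPING": [100.0, 100.0],
}
hsc = {
    "LINEAGE_GENE": [10.0, 10.0],
    "LEUKEMIA_GENE": [20.0, 30.0],
    "HOUSEKEEPING": [100.0, 100.0],
}
pbmc = {
    "LINEAGE_GENE": [1800.0, 2200.0],
    "LEUKEMIA_GENE": [20.0, 30.0],
    "HOUSEKEEPING": [100.0, 100.0],
}

vs_hsc = top_hits(t_all, hsc, 2)
vs_pbmc = top_hits(t_all, pbmc, 2)
shared = [gene for gene in vs_hsc if gene in vs_pbmc]
print(vs_hsc, vs_pbmc, shared)
```

Transcripts that remain in the top list under both comparators are, by this reasoning, less likely to reflect lineage alone.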



In a series of papers in the 1980s, Seon et al. described a unique human T-cell leukemia
antigen identified by a monoclonal antibody, which they designated TALLA (61, 62).
Immunotoxin-conjugated antibody to TALLA was found to control the growth of leukemic
xenografts in nude mouse models (63). The antigen, a tetraspanin, is now referred to as
TSPAN7, is expressed from the X chromosome, and has most recently been explored with regard
to its role in differentiating glutamatergic neurons (64). As discussed above, little is
known about HHIP and leukemia. As HHIP is likely to regulate stem cell-like characteristics
of either leukemia cells or the stromal cells that support them, it should be considered a
high-value target (53). Also of interest was our detection of SLC7A5, which is also known
as CD98, in the T score but not the fold-change column of Table 3, where comparison is made
to PBMC. A humanized monoclonal antibody to CD98, IGN523, is currently being tested in
refractory or relapsed AML (NCT02040506). CD98 is composed of two chains, and has the
ability to signal through both mTOR and AKT pathways (65).

All of the gene expression data are presented in our fully annotated Oncogenomics database.

DISCUSSION 

In seeking to analyze a series of pediatric malignancies for potential immunotherapeutic
targets, we did not have a data set sufficiently robust to analyze pediatric T-ALL. In the
data presented here, we now can add T-ALL to our list of pediatric malignancies that have
been analyzed at the transcript level. Future studies will be enhanced by our ability to
look at next-generation sequencing data, and to perhaps identify differential expression
between tumor and normal at even finer levels, such as comparing alternative splicing
profiles. Although we did not discuss our data for pre-B-ALL at a transcript-by-transcript
level, our results were consistent with our earlier work looking at a more restricted set
of normal tissues (4). The antigens CD19, CD79A, CD79B, CD49d, CD53, CD72, TLR1, and MILR1
were all reported previously. Thus, our pre-B-ALL analysis sets are consistent. New and
potentially interesting hits that arose in this new analysis for B-ALL include CD52, CD24,
and CD10/CALLA. Therapeutic antibodies have been developed for all of them.

The analysis of pediatric T-ALL is novel, and seems to reflect the perplexing nature of this
disease in that the majority of the over-expressed transcripts are likely to be broadly
expressed throughout the immune system. Thus CD2, CD69, and CD99 are all very risky targets
that may require more sophisticated approaches than a single antibody or CAR-T. This holds
true for CD96 as well, which is being used to purge bone marrow, but is too broadly
expressed to be targeted systemically. We do have some reasonable antigens to target, but
the data may be telling us something else. It may be time to consider more fully the use of
clonotypic antibodies or targeted CAR-Ts to permanently eliminate specific T-cell subsets.

In the clinic, minimal residual disease can be detected in >95% of ALL patients by PCR
detection of patient-specific rearrangements of the TCR or the immunoglobulin locus. This
argues that at a genetic level the TCR is a valid target for therapeutic development (66).
Campana et al. looked carefully at the expression
Table 3 | Comparison of T-ALL to HSC and PBMC normal gene expression profiles.

Rank  Symbol    Gene title                                T score | Rank  Symbol    Gene title                                Fold-change

T-ALL vs. HSC
1               CD3d molecule, delta                      50.7    | 1               CD3d molecule                             140.1
2     CD3G*     CD3g molecule, gamma                      24.2    | 2     TRBC1*    T-cell receptor beta constant 1           110.7
3     TRBC1*    TCR beta constant 1                       22.1    | 3     CD1E*     CD1e molecule                             54.6
4     CD99*     CD99 molecule                             20.8    | 4     TRDC*     T-cell receptor delta constant            42.9
5     CD7*      CD7 molecule                              20.1    | 5     TRDV3     T-cell receptor delta variable 3          39.4
6     B2M       Beta-2-microglobulin                      18.6    | 6     CD2*      CD2 molecule                              39.2
7     CD96*     CD96 molecule                             17.2    | 7     CD28      CD28 molecule                             31
8     IL2RG*    Interleukin 2 receptor, gamma             17.0    | 8     CD3E*     CD3e molecule, epsilon                    28.6
9     CD3E*     CD3e molecule, epsilon                    16.9    | 9     CD7*      CD7 molecule                              26.6
10    LIME1     Lck interacting transmembrane adaptor 1   16.1    | 10    CD1B*     CD1b molecule                             26
11    CD46      CD46 molecule                             16.1    | 11    HHIP*     Hedgehog interacting protein              23.3
12    CD247*    CD247 molecule                            16.0    | 12    CD96*     CD96 molecule                             22.9
13    ITGB1     Integrin, beta 1, CD29                    14.3    | 13    CD99*     CD99 molecule                             22.1
14    LAX1      Lymphocyte transmembrane adaptor 1        13.5    | 14    CD3G*     CD3g molecule, gamma                      21.2
15    ITGAE     Integrin, alpha E, CD103                  13.4    | 15    AQP3      Aquaporin 3 (Gill blood group)            20.7
16              Hedgehog interacting protein              12.9    | 16    IL-7R*    Interleukin 7 receptor                    20.5
17    ITM2A     Integral membrane protein 2A              12.4    | 17    LPAR6     Lysophosphatidic acid receptor 6          18.5
18    CXCR4*    Chemokine (C-X-C motif) receptor 4        12.2    | 18    CD8A      CD8a molecule                             18.5
19    CD28*     CD28 molecule                             12.1    | 19    IL2RG*    Interleukin 2 receptor, gamma             18.1
20    IGHM      Immunoglobulin heavy constant mu          12.1    | 20    PVRIG     Poliovirus receptor rel. Ig domain cont.  16.4
                                                                  | 21              Tetraspanin 7 TALLA-1                     16.1

T-ALL vs. PBMC
1     TFRC      Transferrin receptor (p90, CD71)          21.2    | 1     TSPAN7    Tetraspanin 7 TALLA-1                     55.0
2               Tetraspanin 7 TALLA-1                     19.8    | 2               Hedgehog interacting protein              23.7
3     TBXA2R    Thromboxane A2 receptor                   19.5    | 3     HMMR*     CD168-RHAMM                               12.8
4     CD3D*     CD3d molecule, delta (CD3-TCR complex)    19.4    | 4     IGSF10    Ig superfamily, member 10                 11.4
5     FAF1      Fas (TNFRSF6) associated factor 1         16.4    | 5     VANGL1    VANGL planar cell polarity protein 1      6.8
6     HHIP*     Hedgehog interacting protein              14.6    | 6     TFRC      CD71-transferrin receptor                 5.3
7     SLC39A8   SLC 39 (zinc transp.), 8                  14.5    | 7     TRO       Trophinin                                 5.0
8     A1BG      Alpha-1-B glycoprotein                    14.2    | 8     CACNB3    Ca++ channel, volt.-dep., beta 3 sub      4.3
9     OTOP2     Otopetrin 2                               13.4    | 9     MAGED1    Melanoma antigen family D, 1              4.2
10    ITGAE     Integrin, alpha E, CD103                  13.3    | 10    CD3D*     CD3d molecule, delta (CD3-TCR complex)    4.1
11    TMEM237   Transmem. prot. 237 (tetraspanin)         12.9    | 11    CHRNA5    Cholinergic R, nicotinic, alpha 5         4.1
12    TRO       Trophinin                                 12.6    | 12    FAF1      Fas (TNFRSF6) associated factor 1         3.1
13    SLC22A7   SLC 22 (organic anion transporter), 7     12.5    | 13    PTK7      Protein tyrosine kinase 7                 3.1
14    SLC7A5    CD98/amino acid transporter, light chain  12.2    | 14    A1BG      Alpha-1-B glycoprotein                    2.8
15    CACFD1    Ca++ channel flower dom. cont. 1          11.8    | 15    GPC2      Glypican 2                                2.4
16    IGH       Immunoglobulin heavy locus                11.1    | 16    LRRC37A3  Leucine-rich repeat containing 37 A3      2.2
17    TNFRSF21  TNF receptor superfamily, member 21       11      | 17    IGH       Immunoglobulin heavy locus                2.2
18    VANGL1    VANGL planar cell polarity protein 1      10.8    | 18    MRC2      Mannose receptor, C-type 2                2.1
19    FGFR1     Fibroblast growth factor receptor 1       10.6    | 19    SCNN1A    Na+ channel, non-volt.-gated 1 alpha sub  2.1
20    GPC2      Glypican 2                                10.6    | 20    LGR4      Leucine-rich repeat cont. GPCR 4          1.9

As for Table 2, T-ALL gene expression profiles were compared to either CD34(+) HSC (top half of table) or PBMC (bottom half of table). Data are reported and ranked
according to T score (left half of table) or fold-change (right half of table). Both Gene Symbol and Gene Title are given for each transcript. Those transcripts that are
represented in Table 2 (where comparison was to an array of normal tissues) are followed by an asterisk in the Gene Symbol column. Those transcripts appearing in
both the top and bottom half of the table are in bold, with gray shading.



of intracellular vs. membrane-expressed CD3 and of the TCR in
T-ALL (67). In a series of ALL cases in which patients ranged in age
from 6 to 65 (median age, 20), 11 of 40 showed CD3 and either TCRAB
(9 of 11) or TCRGD (2 of 11) on the membrane. Cytoplasmic TCR was
detected in 12/40, while 17/40 had neither CD3 nor TCR surface
expression. In a study of adult T-ALL associated with HTLV-1
infection in Japan, all samples expressed surface TCR, albeit with
a low fluorescence intensity (68). In this study, 7/12 patient
samples expressed TCR on the surface of >95% of their blasts. Thus,
the disease association with differentiation status holds, and we can
state that a third of all patients with T-ALL are likely to have surface
TCR. Unfortunately, our acuity in diagnosis of T-ALL according to
the degree of cellular progression along the T-cell developmental 
pathway has not resulted in any useful differentiation of outcome 
status, with the possible exception of a slightly better outcome for 
CD10+ disease (69). 
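
As a quick check on the "a third" figure, the counts quoted above for the 40-patient series can be tabulated. The following is a minimal Python sketch that uses only the numbers given in the text; the Wilson score interval is added here purely as an illustration of the uncertainty in a sample of 40 and is not part of the cited study or of our analysis.

```python
from math import sqrt


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Counts quoted in the text for the 40-patient series.
series = {
    "surface CD3 + TCR": 11,
    "cytoplasmic TCR": 12,
    "neither CD3 nor TCR on the surface": 17,
}
total = sum(series.values())

for label, count in series.items():
    low, high = wilson_interval(count, total)
    print(f"{label}: {count}/{total} = {count / total:.1%} "
          f"(approx. 95% interval {low:.1%} to {high:.1%})")
```

The surface-positive group is 11/40, or 27.5%, and with so few patients the interval around that estimate is wide enough to include one third.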

We propose a second method for targeting certain sub-types 
of T-ALL that express a surface T-cell receptor. We hypothesize 
that deletion of a specific family or sub-family of TCR-bearing 
ALL, for example the TCRBV5-4 family that ranked the highest 
of all differentially expressed genes in Table 2, would effectively 
eliminate that leukemic clone. Using CAR-Ts specific for a single
TCR family would result in only a partial depletion of normal
T-cells, and the physiological consequences with respect to induced
immunosuppression may be minimal. This is in contrast to current
CD19-CAR-T therapy for B-ALL, where the entire healthy B-cell
compartment is ablated. Alternatively, target lists such as the one
we present here may be exploited in the future, when a new
generation of immunotherapeutic agents that require more than one
set of antigens to fully activate an effector T-cell can be
developed. The creation of a treatment strategy for prostate cancer
that requires recognition of both PSMA and PSCA to fully trigger a
dual-engineered T-cell population may be one such approach (70).
Our next step will be to validate surface expression of the tran- 
scripts described here by flow cytometry, and thus complement 
gene expression data with the demonstrated presence of protein 
that could be targeted on the surface of pediatric T-ALL. 

Finally, we also explored the effect of altering the normal tis- 
sue comparator from a collection of normal tissues throughout 
the body, to two more restricted normal tissue sample sets. The 
first of these was the CD34+ HSC compartment in bone marrow.
As this is the tissue from which the leukemia arises, one might
anticipate that a more tumor-like set of transcripts would be
identified. However, the transcripts remained very much T
lymphocyte-specific, and up to 70% of the transcripts were shared
with the broad tissue comparison when T-ALL was compared to normal
HSCs (Table 2 vs. Table 3). We did see a noticeably different set of
transcripts when peripheral blood lymphocytes were used as the
normal tissue comparator. The T lymphocyte-specific transcripts were
lost and a more generic tumor-like set of transcripts was noted.
Interestingly, the CD3-delta chain of the TCR complex remained
highly over-expressed in all comparisons, as did TALLA-1 (in 2 of
3) and HHIP (in 3 of 3). TALLA-1 has been described previously as
a T-cell leukemia-specific antigen, and our analysis confirms that
this is a high priority target worthy of further exploration (61). 
Shared between all comparisons was HHIP. HHIP has been ele- 
gantly described as a modulator of SHH signaling, adding another 
level of regulation to the patched (PTC1) and SMO pathway that
modulates expression of Hedgehog (Hh)-regulated genes (52). A 
report demonstrating participation of this pathway in regulating 
myeloid leukemia supports further research into the role of HHIP 
in lymphocytic leukemia (53). In conclusion, there may be no 
single "correct" normal tissue to use as a comparator for gene 
expression profiles from leukemia samples. Instead, we should 
view each comparison as a means to identifying unique aspects of 
leukemia-associated gene expression patterns. These approaches 
highlight that general tissue comparisons yield results that reflect
the tissue of origin of the malignancy, as seen in Table 2, while
comparison to similar tissue types (in this case mature lymphocytes)
reveals transcripts that are biased toward those associated
with tumorigenesis, as seen in Table 3. However, both types of targets are of
value, especially in the context of immunotherapy, where finding 
tumor-specific antigens that spare the host from fatal toxicity is the 
goal. If we had restricted ourselves to the HSC or lymphocytic
comparators in studying B-ALL, we would have missed antigens like
CD19 and CD22 that are currently being targeted in experimen- 
tal trials. When an antigen is found in all comparator scenarios, 
such as HHIP, it certainly warrants further investigation. This is 
also the case for TALLA-1, although it may not differ sufficiently 
from normal tissue. The antigens identified by using a broad tissue 
sample are certainly valuable, but as discussed, must be evaluated 
for the impact of targeting the normal tissues that may share these 
antigens, such as mature T-cell subsets. 
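
The comparator logic discussed above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the pipeline used for the analysis: the gene names and expression values are invented, the "T score" is assumed to be Welch's t statistic, and fold-change is taken as the ratio of mean expression in T-ALL to mean expression in the comparator. The sketch ranks genes against each comparator separately and then intersects the top-ranked sets, which mirrors how a transcript such as HHIP is flagged when it appears in every comparison.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(tumor, normal):
    """Welch's t statistic (an assumed stand-in for the T score)."""
    se = sqrt(variance(tumor) / len(tumor) + variance(normal) / len(normal))
    return (mean(tumor) - mean(normal)) / se


def fold_change(tumor, normal):
    """Ratio of mean tumor expression to mean comparator expression."""
    return mean(tumor) / mean(normal)


def top_hits(tumor, comparator, score, n):
    """Names of the n genes that score highest against one comparator."""
    ranked = sorted(tumor, key=lambda g: score(tumor[g], comparator[g]),
                    reverse=True)
    return ranked[:n]


# Hypothetical linear-scale expression values, invented for illustration.
# GENE_A behaves like a T-lineage marker, GENE_B like a leukemia-associated
# transcript, and GENE_C is elevated only relative to mature lymphocytes.
t_all = {
    "GENE_A": [900.0, 1100.0, 1000.0],
    "GENE_B": [400.0, 500.0, 450.0],
    "GENE_C": [210.0, 190.0, 200.0],
}
comparators = {
    "broad_tissue": {
        "GENE_A": [50.0, 70.0, 60.0],
        "GENE_B": [40.0, 50.0, 45.0],
        "GENE_C": [190.0, 210.0, 205.0],
    },
    "hsc": {
        "GENE_A": [80.0, 100.0, 90.0],
        "GENE_B": [45.0, 55.0, 50.0],
        "GENE_C": [180.0, 200.0, 190.0],
    },
    "pbmc": {
        "GENE_A": [800.0, 1000.0, 900.0],
        "GENE_B": [50.0, 70.0, 60.0],
        "GENE_C": [90.0, 110.0, 100.0],
    },
}

# Top two genes by fold-change against each comparator.
hits = {name: set(top_hits(t_all, comp, fold_change, n=2))
        for name, comp in comparators.items()}

# Genes that survive every choice of comparator.
shared = set.intersection(*hits.values())

print(hits)
print("shared across all comparators:", shared)
```

In this toy data the lineage-like marker drops out once mature lymphocytes are the comparator, while the transcript elevated against all three comparators is the only one retained in the intersection. Passing `welch_t` instead of `fold_change` ranks by the assumed T score instead.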

In the Pediatric Cancer Genomes project, a database of prevalent
mutations found in pediatric cancers, including T-ALL, is
being assembled. Currently, this database is very useful for
identifying mutations, whereas our analysis focuses on gene
expression levels relative to normal tissue data sets. The largest
collection of T-ALL samples at the Pediatric Cancer Genomes 
database is derived from ETP-ALL (9), and when this and other 
datasets are explored more closely it is apparent that only a very 
small minority of the genes identified are cell surface membrane 
proteins, with one exception, IL-7R. We also identified IL-7R
in T-ALL when HSC was used as a comparator in the fold- 
change analysis (Table 3), and as hit number 32 in the comparison 
run with our broad normal tissue panel (Table 2). This high- 
lights again that different targets of interest will be identified 
when different normal comparators are used. This also encourages 
mining the "normal" T-cell developmental antigens we have identified
for oncogenic mutations they may also harbor. The entire data set
is available in our fully annotated (including membranous protein
assignment) gene expression database, where users can search
specific genes and compare gene expression between T-ALL, B-ALL,
HSC, PBMC, and a range of normal tissues, at:
http://home.ccr.cancer.gov/oncology/oncogenomics/.